Systems and Methods for Developing Diagnostic Tests Based on Biomarker Information from Legacy Clinical Sample Sets

ABSTRACT

Disclosed are systems and methods for developing diagnostic tests (e.g., detection, screening, monitoring, and prognostic tests) based on biomarker information from legacy clinical sample sets, for which only small sample volumes (e.g., about 0.05 to about 1.0 mL or less per sample) are typically available. For example, biomarkers (e.g., about 10, 50, 100, 150, 200, 300, or more) may be detected in the clinical samples through the use of single molecule detection, and each biomarker may be detected in an assay that includes about 1 uL or less of a legacy clinical sample.

PRIORITY CLAIM

This application is a continuation of U.S. patent application Ser. No. 15/007,786, filed on Jan. 27, 2016, which is a continuation of U.S. patent application Ser. No. 13/746,216, filed on Jan. 21, 2013, which is a divisional of U.S. patent application Ser. No. 13/561,913, filed Jul. 30, 2012 (now U.S. Pat. No. 8,357,497), which is a divisional of U.S. patent application Ser. No. 12/300,019, filed Jun. 29, 2009 (now U.S. Pat. No. 8,232,065), which is a U.S. 371 National Stage application of PCT Application No. PCT/US2007/011196, filed May 8, 2007, which claims priority to U.S. Provisional Patent Application No. 60/798,867, filed May 8, 2006, the entire contents of each of which are incorporated herein by reference and relied upon.

FIELD OF THE INVENTION

Embodiments of the present invention relate to systems and methods for developing diagnostic tests and, more specifically, to systems and methods for developing diagnostic tests based on biomarker information from legacy clinical sample sets, for which only small sample sizes (e.g., about 0.05 to 1.0 mL per sample) are typically available. In a preferred embodiment, the biomarker information is detected in the clinical samples through the use of single molecule detection.

BACKGROUND OF THE INVENTION

Diagnostic tests have been provided for detecting, screening, monitoring, and/or predicting the future development of various health states (e.g., disease states) in a subject. Typically, the detecting, screening, monitoring, or prognosis is provided by a diagnostic test based, at least in part, on the presence or level(s) of one or more biological markers (“biomarkers”) in a clinical sample taken from the subject (e.g., the subject's blood). Such biomarkers are selected because the presence, absence, or levels of such biomarkers, alone or in combination, are indicative of the presence, stage, or future clinical course of the health state. Often, but not necessarily, the diagnostic test may additionally be based on clinical information concerning the subject. Determining an appropriate diagnosis or prognosis for a subject can, for example, advantageously increase the subject's chances for survival and/or recovery.

Diagnostic tests must undergo a development stage during which the tests are formulated (and optionally tested/validated) using previously collected samples stored for future research and development needs. This stage precedes their use in diagnosing or predicting the development of disease in subjects in real time. The information used to formulate and validate the tests typically comes from clinical samples for a cohort of subjects for whom at least some biochemical and clinical data are known regarding the presence or absence of the health state under consideration. Thus, traditionally a party who is desirous of developing a diagnostic test for a given health state is required to commit significant resources to the collection of clinical samples (and optionally clinical information such as medical history) from subjects who have, and/or lack, the health state, often at various stages. This data collection process can take many years, depending on the type of disease being considered and the party's relative access to suitable subjects.

Traditional approaches for developing diagnostic tests also require the clinical samples that are collected to have sufficiently large volumes, and such large samples cannot always be readily obtained. Specifically, traditional biomolecular detection approaches require large sample volumes in order to allow for the selection of a set of biomarkers that will be useful in the determination of a patient's health state. Of all the biomarkers that are evaluated (e.g., 1-3, 150-300 biomarkers, or 1000 or more), only those biomarkers that are determined to aid in the determination of the health state in a patient are included in the final diagnostic test. For example, according to one approach, multiple single-biomarker ELISAs used to measure the presence or level of 300 biomarkers typically require a serum or plasma sample size of about 30 mL of specimen per individual (i.e., 100 uL per assay times 300 biomarkers). The required sample volume becomes 90 mL of specimen per individual if the assays are done in triplicate. Such a volume is impractically large, and few studies have ever collected so much clinical sample. Multiplexing, which involves measuring multiple biomarkers in the same reaction vessel, can conserve sample and thereby reduce the overall required sample volume, but it requires compatibility between all the assay components and typically compromises sensitivity through increased background effects. As a result, on an assay-by-assay basis, individual assays are typically at least 10-fold more sensitive than their counterparts within a multiplexed assay.
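By way of illustration and not limitation, the sample-volume arithmetic of the foregoing example can be sketched as follows. The function name `required_volume_ml` and its parameters are hypothetical and are introduced here only for illustration; the inputs (300 biomarkers, 100 uL per assay, triplicate measurement) are those of the example above, and dead volume and pipetting losses are ignored as a simplifying assumption.

```python
def required_volume_ml(n_biomarkers, ul_per_assay, replicates=1):
    """Return the specimen volume (mL) needed per individual.

    Assumes one single-biomarker assay per biomarker per replicate,
    with no dead volume or pipetting loss (a simplifying assumption).
    """
    total_ul = n_biomarkers * ul_per_assay * replicates
    return total_ul / 1000.0  # 1 mL = 1000 uL


# Figures taken from the ELISA example in the text.
single = required_volume_ml(300, 100)                    # 30.0 mL
triplicate = required_volume_ml(300, 100, replicates=3)  # 90.0 mL
print(single, triplicate)
```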

In view of the foregoing, it would be desirable to provide systems and methods for developing diagnostic tests in which access to suitable clinical samples is improved and which rely on smaller sample volumes.

SUMMARY OF THE INVENTION

The above and other objects and advantages of the present invention are provided in accordance with the principles of the present invention described herein. Embodiments of the present invention relate to systems and methods for developing diagnostic tests based on biomarker information from legacy clinical sample sets, for which only small sample volumes (e.g., about 0.05 to 1.0 mL per individual) are typically available. As used herein, a “legacy clinical sample set” is one or more clinical samples (e.g., 10 to 5000 samples or more) collected in the past (i.e., retrospective sample collections). The use of legacy clinical samples, as opposed to performing the process of collecting clinical samples prospectively, reduces the resources and time that must be committed to developing new diagnostic tests. Legacy clinical samples may be from, for example, one or more past studies that occurred over a span of 1 to 40 years or more, which studies may be accompanied by tens to thousands of clinical parameters, traditional laboratory measurements that are considered risk factors or that provide additive information to enable a better clinical decision to be made, and other previously measured information (e.g., clinical data such as the subject's age, weight, ethnicity, medical history, and/or other information). In most cases, the legacy clinical samples are serum or plasma samples that have been stored for years at −80 degrees Centigrade or −20 degrees Centigrade. In other examples, a legacy clinical sample can include, for example, blood cells, ascites fluid, interstitial fluid, bone marrow, sputum, urine, or other biological sample. Examples of such past studies, which are included for the purpose of illustration and not limitation, are listed below:

1. DPP (Diabetes Prevention Program): An NIH-sponsored trial that studied the impact of lifestyle modifications, metformin vs. placebo. This study had 2.8 years of follow-up with diabetes outcomes.
2. IRAS (Insulin Resistance Atherosclerosis Study): Studied the impact of insulin resistance on the development of cardiovascular disease.
3. ARIC (Atherosclerosis Risk in Communities Study): This study includes CVD and cardiovascular outcomes.
4. Finnish Diabetes Prevention Study: Studied the impact of lifestyle changes on the development of diabetes.
5. Israeli Diabetes Research Group (MELANY): Studied the development of diabetes in healthy normal subjects from the Israeli military.
6. HDDRISC (Heart Disease and Diabetes Risk Indicators in a Screened Cohort): Collection of diabetes and cardiovascular outcomes.
7. WOSCOPS (West of Scotland Coronary Prevention Study): Studied the impact of pravastatin on reduction of LDL and reduction in myocardial events.
8. ASCOT (Anglo-Scandinavian Cardiac Outcomes Trial): Studied the impact of different medicines for lowering blood pressure and cholesterol. CVD outcomes collected.
9. SOF (Study of Osteoporotic Fractures): Study looks for predictors of fracture in women over 65 years of age.
10. NORA (National Osteoporosis Risk Assessment): Studied fracture outcomes in women with varying BMD levels.
11. Framingham Heart Study: Related to identifying the common factors or characteristics that contribute to CVD by following its development over a long period of time in a large group of participants who had not yet developed overt symptoms of CVD or suffered a heart attack or stroke.
12. CARDIA (Coronary Artery Risk Development in Young Adults): A longitudinal study designed to trace the development of risk factors for coronary heart disease in a cohort of 18-30 year olds (1985) in four U.S. cities.
13. Reykjavik Study: A long-term prospective population-based cardiovascular study of 33-79 year olds with 4 to 20 year follow-up (1967-91), in Iceland.
14. Malmo Preventive Project: A prospective, population-based study of the effects of interventions on mortality and cardiovascular morbidity in 32-51 year olds (1974-1992) in Sweden.
15. Heart Protection Study: A very large, prospective, double-blind, randomized, controlled trial investigating prolonged use (>5 years) of a statin and an antioxidant vitamin cocktail in individuals 40 to 80 years old in the United Kingdom who had an elevated risk for CHD.
16. 4S (Scandinavian Simvastatin Survival Study): Large double-blind, randomized trial designed to evaluate the effect of a statin on mortality and morbidity in patients with coronary heart disease (CHD).
17. DREAM (Diabetes Reduction Assessment with ramipril and rosiglitazone Medication) Study: A large, double-blind, randomized, placebo-controlled trial evaluating the effects of an ACE inhibitor and/or a thiazolidinedione on the development of diabetes, death, or regression to normoglycaemia in adults aged 30 years or more with impaired fasting glucose and/or impaired glucose tolerance, and no previous cardiovascular disease.
18. Physicians' Health Study: A large cohort of apparently healthy male U.S. physicians aged 40 to 84 years in 1982, followed prospectively for an average of 60.2 months.
19. WHI (Women's Health Initiative): A very large, prospective study, involving both clinical trial and observational components, of women 50 to 79 years of age in the U.S., designed to examine the relationship between health, lifestyle, and risk factors for a variety of specific diseases, including CHD.
20. WHS (Women's Health Study): A very large, double-blind, randomized, placebo-controlled trial to evaluate the effects of vitamin E and low-dose aspirin on cardiovascular disease and cancer in apparently healthy U.S. women, age 45 and older, which also included an observational extension.
21. NHS (Nurses' Health Study): A very large, prospective cohort study of nurses aged 30-55 (in 1976) designed to assess the long term effects of oral contraceptive use.
22. NHS II (Nurses' Health Study II): A very large, prospective cohort study of nurses aged 25-42 (in 1989) designed to assess the long term effects of oral contraceptives, diet and lifestyle risks.

In an embodiment of the present invention, methods and systems are provided for developing a diagnostic test for determining a health state in a patient (e.g., a test for predicting or diagnosing a disease such as diabetes, osteoporosis, pre-osteoporosis, or any other disease), in which at least one biomarker is detected in at least one legacy clinical sample. For example, the biomarker may be detected in an immunoassay that includes about 1 uL or less of the legacy clinical sample. The detection may be performed by, for example, a single molecule detector. Typically, although not necessarily, developing a new diagnostic test comprises detecting multiple biomarkers from multiple clinical samples, including samples from subjects known to have a given health state, or with respect to reference ranges from a known normal population. The detected biomarker(s) are then analyzed for an association with the health state. For example, a statistical analysis may be performed to determine whether the biomarker statistically correlates with the presence or absence of the health state, or alternatively correlates with the existing gold standard (whether biomarker, clinical parameter, or otherwise) used for defining the presence of the health state (for example, fasting glucose level for diabetes, blood pressure for hypertension as a health state, or coronary imaging scores or percentage occlusions/stenosis for coronary artery disease). Alternatively or additionally, the analysis may involve determining whether the inclusion of the biomarker in a formula or machine learning analysis increases the ability of a mathematical function resulting from the machine learning analysis to determine the health state in a patient.
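By way of illustration and not limitation, one possible way of analyzing a detected biomarker for an association with a health state is sketched below. The measure used, the area under the receiver operating characteristic curve computed by pairwise comparison, is an assumption chosen for illustration and is only one of many statistical analyses that may be employed. The function name `auc` and the biomarker levels shown are hypothetical and are not data from any study.

```python
def auc(case_levels, control_levels):
    """Probability that a randomly chosen case has a higher biomarker level
    than a randomly chosen control (ties count as one half)."""
    if not case_levels or not control_levels:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_levels:
        for control in control_levels:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_levels) * len(control_levels))


# Hypothetical biomarker levels in arbitrary units.
cases = [5.1, 6.3, 7.0]      # legacy subjects known to have the health state
controls = [4.0, 5.5, 4.8]   # legacy subjects known to lack the health state
score = auc(cases, controls)  # 8 of the 9 case/control pairs have case > control
print(round(score, 3))
```

A value near 0.5 would suggest no association, while values approaching 1.0 (or 0.0) would suggest that the biomarker level separates the two groups.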

In another embodiment, clinical parameters (e.g., age, weight, ethnicity, medical history, and/or other clinical information) that accompany the legacy clinical sample(s) may also be analyzed for an association with the health state.

In yet another embodiment, methods and systems are provided for developing a diagnostic test for determining a health state in a patient, in which a plurality of biomarkers (e.g., 10-300 biomarkers) are detected in a legacy clinical sample through the use of a corresponding plurality of immunoassays, where the total amount of the legacy clinical sample that is used across the plurality of immunoassays is less than about 1 mL (e.g., less than about 0.05 mL). Typically, multiple legacy clinical samples are analyzed in the same fashion, and the detected biomarkers are then analyzed for an association with the health state.
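By way of illustration and not limitation, the following sketch estimates how many single-biomarker assays fit within a given legacy sample volume. The function name `max_biomarkers` is hypothetical, volumes are expressed in whole microliters, and dead volume is ignored; the 1 uL and 100 uL per-assay figures are those mentioned elsewhere herein.

```python
def max_biomarkers(available_ul, ul_per_assay, replicates=1):
    """Largest number of single-biomarker assays, each run `replicates`
    times, that fits in the available sample volume (whole microliters)."""
    if ul_per_assay <= 0 or replicates <= 0:
        raise ValueError("ul_per_assay and replicates must be positive")
    return available_ul // (ul_per_assay * replicates)


# A 1 mL (1000 uL) legacy sample under three hypothetical assay formats.
low_volume = max_biomarkers(1000, 1)                    # 1 uL per assay
low_volume_triplicate = max_biomarkers(1000, 1, replicates=3)
conventional = max_biomarkers(1000, 100)                # 100 uL per assay
print(low_volume, low_volume_triplicate, conventional)
```

Under these assumptions, a panel of 300 biomarkers measured in triplicate fits within 1 mL only when each assay consumes about 1 uL of sample.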

In another embodiment, a diagnostic test is used to screen or monitor a patient for a given health state. The test is developed using any of the methods disclosed herein for screening legacy clinical samples. For example, at least one biomarker indicative of the presence, absence, or likelihood of developing the health state and identified by the methods described herein is employed in the test and its presence, absence, or level is determined.

Other features and advantages of the invention will be apparent from the following detailed description and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the present invention, and not intending to limit the scope of the invention in any way, reference is made to the following description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:

FIGS. 1 and 2 are illustrative diagrams of a single molecule detector in accordance with an embodiment of the present invention;

FIG. 3 is a flowchart of illustrative stages involved in developing a diagnostic test in accordance with an embodiment of the present invention;

FIG. 4 shows a typical result for a working standard curve used in the development of immunoassays in accordance with an embodiment of the present invention;

FIG. 5 shows illustrative single molecule detection data in accordance with an embodiment of the present invention;

FIG. 6 shows a table indicating the actual number of analyte molecules present in a sample across the ranges of various sample sizes and starting analyte molar concentrations; and

FIG. 7 shows, without intending any limitation, the detection limit of selected biomarker assay technologies that are commercially available, indicating their typical analytical reproducibility performance characteristic (coefficient of variation) at these starting analyte concentrations. There are many additional technologies being applied to improve the sensitivity of single molecule detection, including microscopic techniques (atomic force microscopy, magnetic resonance force microscopy, scanning electrochemical microscopy, scanning tunneling microscopy) and spectroscopic techniques (fluorescence correlation spectroscopy, evanescent wave induced fluorescence spectroscopy, scanning near-field optical microscopy, surface-enhanced Raman spectroscopy, surface plasmon resonance).
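By way of illustration and not limitation, the relationship underlying a table such as that of FIG. 6 (number of analyte molecules as a function of sample size and starting molar concentration) can be sketched as follows. The function name `molecules_in_sample` is hypothetical, and the concentrations and volumes shown are illustrative assumptions rather than values taken from FIG. 6.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_in_sample(molar_concentration, volume_ul):
    """Number of analyte molecules = concentration (mol/L) x volume (L) x N_A."""
    volume_l = volume_ul * 1e-6  # 1 uL = 1e-6 L
    return molar_concentration * volume_l * AVOGADRO


# Illustrative inputs only; these are not values read from FIG. 6.
for conc in (1e-12, 1e-15, 1e-18):   # 1 pM, 1 fM, 1 aM
    for vol in (1, 10, 100):         # sample size in microliters
        count = molecules_in_sample(conc, vol)
        print(f"{conc:.0e} M in {vol:>3} uL: about {count:.3g} molecules")
```

Under these assumptions, 1 uL of a 1 femtomolar solution contains roughly 600 analyte molecules, which indicates why very small sample volumes call for detection at or near the single molecule level.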

DETAILED DESCRIPTION OF THE INVENTION

Embodiments of the present invention relate to systems and methods for developing diagnostic tests for diagnosing, and predicting the future development of, various health states (e.g., health states including disease-specific states as well as other non-disease specific states) in a subject. Examples of diseases are osteoporosis, pre-osteoporosis, diabetes, cancer, and any other disease. In one embodiment of the present invention, systems and methods are provided for developing diagnostic tests based on biomarker information from legacy clinical sample sets, for which only small sample sizes (e.g., about 0.05 to 1.0 mL or less) are typically available. In a preferred embodiment, the biomarker information is extracted from the clinical samples through the use of single molecule detection.

Definitions

“Biomarker” in the context of the present invention encompasses, without limitation, proteins, nucleic acids, and metabolites, together with their polymorphisms, isoforms, mutations, derivatives, variants, modifications, and precursors, including nucleic acids and pro-proteins, cleavage products, receptors (including soluble and transmembrane receptors), subunits, fragments, ligands, protein-ligand complexes, multimeric complexes, and degradation products, elements, related metabolites, and other analytes or sample-derived measures. Biomarkers can also include mutated proteins or mutated nucleic acids. Biomarkers also include any calculated indices created mathematically or combinations of any one or more of the foregoing measurements, including temporal trends and differences. The term “analyte” as used herein can mean any substance to be measured and can encompass electrolytes and elements, such as calcium.

“Clinical parameters” encompasses all non-sample or non-analyte markers of subject health status or other characteristics, such as, without limitation, age (AGE), ethnicity (RACE), gender (SEX), diastolic blood pressure (DBP) and systolic blood pressure (SBP), family history (FHX), height (HT), weight (WT), waist (Waist) and hip (Hip) circumference, body-mass index (BMI), past Gestational Diabetes Mellitus (GDM), resting heart rate, EMG, EEG, body temperature, and sleep states.

A “formula,” “algorithm,” or “model” is any mathematical equation, algorithmic, analytical or programmed process, or statistical technique that takes one or more continuous or categorical inputs (herein called “parameters”) and calculates an output value, sometimes referred to as an “index” or “index value”. Non-limiting examples of “formulas” include sums, ratios, and regression operators, such as coefficients or exponents, biomarker value transformations and normalizations (including, without limitation, those normalization schemes based on clinical parameters, such as gender, age, or ethnicity), rules and guidelines, statistical classification models, and neural networks trained on historical populations. Of particular use in combining markers are linear and non-linear equations and statistical classification analyses to determine the relationship between levels of the biomarkers detected in a subject sample and the subject's risk of disease (for example). In panel and combination construction, of particular interest are structural and syntactic statistical classification algorithms, and methods of risk index construction, utilizing pattern recognition features, including established techniques such as cross correlation, Principal Components Analysis (PCA), factor rotation, Logistic Regression (LogReg), Linear Discriminant Analysis (LDA), Eigengene Linear Discriminant Analysis (ELDA), Support Vector Machines (SVM), Random Forest (RF), Recursive Partitioning Tree (RPART), as well as other related decision tree classification techniques, Shrunken Centroids (SC), StepAIC, Kth-Nearest Neighbor, Boosting, Decision Trees, Neural Networks, Bayesian Networks, and Hidden Markov Models, among others. Many of these techniques are useful either combined with a biomarker selection technique, such as forward selection, backwards selection, stepwise selection, complete enumeration of all potential panels of a given size, or genetic algorithms, or they may themselves include biomarker selection methodologies in their own technique. These may be coupled with information criteria, such as Akaike's Information Criterion (AIC) or Bayes Information Criterion (BIC), in order to quantify the tradeoff between additional biomarkers and model improvement, and to aid in minimizing overfit. The resulting predictive models may be validated in other studies, or cross-validated in the study they were originally trained in, using such techniques as Leave-One-Out (LOO) and 10-Fold cross-validation (10-Fold-CV).
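By way of illustration and not limitation, the information criteria and cross-validation schemes mentioned above can be sketched as follows, using the standard definitions AIC = 2k − 2 ln(L) and BIC = k ln(n) − 2 ln(L), where k is the number of fitted parameters, n the number of samples, and L the maximized likelihood. The function names, the two panels, and their log-likelihood values are hypothetical and serve only to show the tradeoff between additional biomarkers and model improvement.

```python
import math


def aic(n_params, log_likelihood):
    """Akaike's Information Criterion; lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


def bic(n_params, log_likelihood, n_samples):
    """Bayes Information Criterion; penalizes parameters more as n grows."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k-fold cross-validation.
    Setting k equal to n_samples gives Leave-One-Out."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# Hypothetical fits: panel B adds two biomarkers for a small likelihood gain.
panel_a = aic(3, -120.0)   # 246.0
panel_b = aic(5, -119.0)   # 248.0, so the smaller panel A is preferred
loo = list(k_fold_splits(10, 10))       # Leave-One-Out on 10 samples
ten_fold = list(k_fold_splits(100, 10))  # 10-Fold on 100 samples
print(panel_a, panel_b, len(loo), len(ten_fold[0][1]))
```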

“Frank Disease” in the context of the present invention is a clearly manifest, unmistakable, evident, or symptomatic disease state that unequivocally meets the definition of the disease set forth by a professional medical organization, such as the World Health Organization.

“Health state” encompasses disease states (e.g., presence, absence, or risk of developing a disease and likely responses to therapies for the disease) as well as other states not necessarily related to a specific disease such as environmental exposure, nutritional status, neurological function, immune status, organ function, and blood chemistry. Generally, determining a health state in a patient/subject involves determining that the patient should be classified within a given one of a plurality of populations (e.g., healthy vs. unhealthy, in a 2-population example).

A “legacy subject” is a subject (defined below) for whom one or more clinical samples are included in a legacy clinical sample set.

A “live subject” is a subject for whom a determination (e.g., diagnosis or prognosis of disease) is made by a diagnostic test that has been developed in accordance with the principles of the present invention.

A “legacy clinical sample” is a clinical sample for an individual from a legacy clinical sample set (which set may have multiple samples for multiple individuals), where the volume of the sample meets a sample requirement (defined below) and the biomarker information from the sample may be used to develop a diagnostic test in accordance with the principles of the present invention.

A “live clinical sample” is a clinical sample from which biomarker information is evaluated by a diagnostic test in order to provide a determination (e.g., diagnosis or prognosis) for a corresponding live subject.

“Measuring” or “measurement” means assessing the presence, absence, quantity or amount (which can be an effective amount) of a given substance within a clinical or subject-derived sample, including the derivation of qualitative or quantitative concentration levels of such substances, or otherwise evaluating the values or categorization of a subject's clinical parameters. Alternatively, the term “detecting” or “detection” may be used and is understood to cover all measuring or measurement as described herein.

“Risk” in the context of the present invention relates to the probability that an event will occur over a specific time period (e.g., conversion to frank Diabetes) and can mean a subject's “absolute” risk or “relative” risk. Absolute risk can be measured with reference to either actual observation post-measurement for the relevant time cohort, or with reference to index values developed from statistically valid historical cohorts that have been followed for the relevant time period. Relative risk refers to the ratio of absolute risks of a subject compared either to the absolute risks of low risk cohorts or an average population risk, which can vary by how clinical risk factors are assessed. Odds ratios, the proportion of positive events (conversion) to negative events (no-conversion) for a given test result, are also commonly used (odds are computed according to the formula p/(1−p), where p is the probability of the event and (1−p) is the probability of no event). Alternative continuous measures which may be assessed in the context of the present invention include time to health state (e.g., disease) conversion and therapeutic conversion risk reduction ratios.
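By way of illustration and not limitation, the risk measures defined above can be sketched as follows. The odds formula p/(1−p) is the one given in the definition; the function names are hypothetical, the odds ratio is computed under the conventional definition as the ratio of two odds (an assumption), and the probabilities shown are illustrative rather than observed conversion rates.

```python
def odds(p):
    """Odds of an event with probability p, per the formula p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def relative_risk(subject_risk, reference_risk):
    """Ratio of a subject's absolute risk to a reference absolute risk
    (e.g., that of a low risk cohort or of an average population)."""
    if reference_risk <= 0.0:
        raise ValueError("reference_risk must be positive")
    return subject_risk / reference_risk


def odds_ratio(p_test_positive, p_test_negative):
    """Ratio of the odds of the event under two different test results."""
    return odds(p_test_positive) / odds(p_test_negative)


# Illustrative absolute risks of conversion over a fixed time period.
subject = 0.20
reference = 0.05
print(odds(subject))                      # approximately 0.25
print(relative_risk(subject, reference))  # approximately 4
print(odds_ratio(subject, reference))     # approximately 4.75
```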

“Pre-Disease” in the context of the present invention refers to a state that is intermediate between that defined as the normal homeostatic and metabolic state and states seen in Frank Disease. Pre-disease states can include abnormalities of homeostatic regulation, abnormal physiological measurements, abnormal morphometric measurements, and/or states in which abnormal levels of clinical parameters or biomarkers are present at a specific time point. Abnormalities are measurements outside the normal range as defined by professional medical organizations, such as the World Health Organization. “Pre-Disease” states, in the context of the present invention, are states, in an individual or in a population, having a higher than normal expected rate of conversion to frank disease. When a continuous measure of Pre-Disease conversion risk is produced, having a “pre-disease condition” encompasses any expected annual rate of conversion above that seen in a normal reference or general unselected normal prevalence population.

“Risk evaluation,” or “evaluation of risk” in the context of the present invention encompasses making a prediction of the probability, odds, or likelihood that an event or health state may occur, or of the rate of occurrence of the event or conversion from one health state to another (e.g., from a normoglycemic condition to a pre-diabetic condition or pre-Diabetes, or from a pre-diabetic condition to pre-Diabetes or Diabetes). Risk evaluation can also comprise prediction of future levels, scores or other indices of disease, either in absolute or relative terms in reference to a previously measured population. The methods of the present invention may be used to make continuous or categorical measurements of the risk of conversion between health states. Embodiments of the invention can also be used to discriminate between normal and pre-diseased subject cohorts. In other embodiments, the present invention may be used so as to discriminate pre-diseased from diseased, or diseased from normal. Such differing uses may require different biomarker combinations in an individual panel, different mathematical algorithm(s), and/or different cut-off points, but are subject to the same aforementioned measurements of accuracy for the intended use.

A “sample” in the context of the present invention is a biological sample isolated from a subject and can include, by way of example and not limitation, whole blood, serum, plasma, blood cells, endothelial cells, tissue biopsies, lymphatic fluid, ascites fluid, interstitial fluid (also known as “extracellular fluid,” which encompasses the fluid found in spaces between cells, including, inter alia, gingival crevicular fluid), bone marrow, cerebrospinal fluid (CSF), saliva, mucus, sputum, sweat, urine, or any other secretion, excretion, or other bodily fluids.

A “sample requirement” is the volume of starting sample required by a given assay technology in order to achieve an acceptable level of performance (coefficient of variation).
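By way of illustration and not limitation, the coefficient of variation referred to in the foregoing definition can be computed from replicate measurements as the sample standard deviation divided by the mean, expressed as a percentage. The function names, the replicate readings, and the acceptance thresholds below are hypothetical; no particular threshold is prescribed herein.

```python
import statistics


def coefficient_of_variation(replicates):
    """Percent CV = 100 x sample standard deviation / mean."""
    mean = statistics.mean(replicates)
    if mean == 0:
        raise ValueError("mean of replicates must be non-zero")
    return 100.0 * statistics.stdev(replicates) / mean


def meets_sample_requirement(replicates, max_cv_percent):
    """True if the replicate readings reach the acceptable CV."""
    return coefficient_of_variation(replicates) <= max_cv_percent


# Hypothetical triplicate readings of one biomarker at one sample volume.
readings = [98.0, 102.0, 100.0]
cv = coefficient_of_variation(readings)           # about 2 percent
print(cv, meets_sample_requirement(readings, 10.0))
```

In such a sketch, the sample requirement of an assay technology would be the smallest starting volume at which the replicate readings still satisfy the chosen threshold.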

A “subject” in the context of the present invention is preferably a mammal. The mammal can be a human, non-human primate, mouse, rat, dog, cat, horse, or cow, but is not limited to these examples. Mammals other than humans can be advantageously used as subjects that represent animal models of disease, pre-disease, or a pre-disease condition. A subject can be male or female. A subject can be one who has been previously diagnosed or identified as having a health state (e.g., disease, pre-disease, or a pre-disease condition), and optionally has already undergone, or is undergoing, a therapeutic intervention for the health state. Alternatively, a subject can also be one who has not been previously diagnosed as having a given health state. For example, a subject can be one who exhibits one or more risk factors for a disease, pre-disease, or a pre-disease condition, or a subject who does not exhibit disease risk factors, or a subject who is asymptomatic for a disease, pre-disease, or a pre-disease condition. A subject can also be one who is suffering from or at risk of developing disease, pre-disease, or a pre-disease condition.

“Traditional laboratory risk factors” correspond to biomarkers isolated or derived from subject samples and which are currently evaluated in the clinical laboratory and used in traditional global risk assessment algorithms (e.g., Stern, Framingham, Finland Diabetes Risk Score, ARIC Diabetes, and Archimedes). Traditional laboratory risk factors commonly tested from subject blood samples include, but are not limited to, total cholesterol (CHOL), LDL (LDL/LDLC), HDL (HDL/HDLC), VLDL (VLDLC), triglycerides (TRIG), glucose (including, without limitation, the fasting plasma glucose (Glucose) and the oral glucose tolerance test (OGTT)) and HBA1c (HBA1C) levels.

INDICATIONS OF THE INVENTION

Embodiments of the present invention allow for the determining of a health state in a patient. For example, the risk of developing disease, pre-disease, or a pre-disease condition typically can be detected with a pre-determined level of predictability by measuring an “effective amount” of a biomarker in a test sample (e.g., a subject derived sample), and comparing the effective amount to reference or index values, often utilizing mathematical algorithms or formulas in order to combine information from results of multiple individual biomarkers and from non-analyte clinical parameters into a single measurement or index. When appropriate, subjects identified as having an increased risk for a health state can optionally be selected to receive treatment regimens, such as administration of prophylactic or therapeutic compounds, or implementation of exercise regimens or dietary supplements to prevent or delay the onset of, for example, disease, pre-disease, or a pre-disease condition or other adverse health conditions.

The amount of the biomarker can be measured in a test sample and compared to a normal control level, utilizing techniques such as reference limits, discrimination limits, or risk defining thresholds to define cutoff points and abnormal values for a health state. The normal control level means the level of one or more biomarkers or combined biomarker indices typically found in a subject not having the health state. Such normal control level and cutoff points may vary based on whether a biomarker is used alone or in a formula combining with other biomarkers into an index. Alternatively, the normal control level can be a database of biomarker patterns from previously tested subjects who did not convert to the health state over a clinically relevant time horizon.
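By way of illustration and not limitation, a comparison of a measured biomarker amount to a normal control level using reference limits can be sketched as follows. The limits are taken here as the mean plus or minus 1.96 standard deviations of levels observed in subjects not having the health state, which is an assumption made for illustration; other cutoff definitions (e.g., percentiles, discrimination limits, or risk defining thresholds) may equally be used. The function names and the control levels are hypothetical.

```python
import statistics


def reference_limits(normal_levels, z=1.96):
    """Lower and upper reference limits from a normal control population,
    assuming roughly normally distributed levels (an assumption)."""
    mean = statistics.mean(normal_levels)
    sd = statistics.stdev(normal_levels)
    return mean - z * sd, mean + z * sd


def classify(level, limits):
    """Label a measured level relative to the reference limits."""
    low, high = limits
    if level < low:
        return "below"
    if level > high:
        return "above"
    return "within"


# Hypothetical control levels in arbitrary units (mean 5.0, SD 1.0).
normals = [4.0, 5.0, 6.0]
limits = reference_limits(normals)   # about (3.04, 6.96)
print(limits, classify(7.5, limits), classify(5.2, limits))
```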

The present invention may be used to make continuous or categorical measurements of the risk of conversion to an adverse health state (e.g., disease), thus diagnosing and defining the risk spectrum of a category of subjects defined as predisposed to the adverse health state. In the categorical scenario, the methods of the present invention can be used to discriminate between (for example) normal and pre-diseased subject cohorts. In other embodiments, the present invention may be used so as to discriminate pre-disease from disease, or diseased from normal. Other non-disease specific health states can also be determined. Such differing uses may require different biomarker combinations in individual panels, mathematical algorithms, and/or cut-off points, but are subject to the same aforementioned measurements of accuracy for the intended use.

Identifying patients that are predisposed to adverse health states (e.g., pre-disease states) enables the selection and initiation of various therapeutic interventions or treatment regimens in order to delay, reduce or prevent those patients' conversion to the adverse health states (e.g., disease). Levels of a specific amount of biomarker also may allow for the course of treatment of the health state (e.g., disease, pre-disease, or a pre-disease condition) to be monitored. For example, in this method, a biological sample can be provided from a subject undergoing treatment regimens, e.g., drug treatments, for a disease. Such treatment regimens can include, but are not limited to, exercise regimens, dietary supplementation, weight loss, surgical intervention, device implantation, and treatment with therapeutics or prophylactics used in subjects diagnosed or identified with various health states. If desired, biological samples are obtained from the subject at various time points before, during, or after treatment.

The present invention can also be used to screen patient or subject populations in any number of settings. For example, a health maintenance organization, public health entity or school health program can screen a group of subjects to identify those requiring interventions, as described above, or for the collection of epidemiological data. Insurance companies (e.g., health, life or disability) may screen applicants in the process of determining coverage or pricing, or existing clients for possible intervention. Data collected in such population screens, particularly when tied to any clinical progression to conditions like disease, pre-disease, or a pre-disease condition, will be of value in the operations of, for example, health maintenance organizations, public health programs and insurance companies. Such data arrays or collections can be stored in machine-readable media and used in any number of health-related data management systems to provide improved healthcare services, cost effective healthcare, improved insurance operation, etc. See, for example, U.S. Patent Application No.; U.S. Patent Application No. 2002/0038227; U.S. Patent Application No. US 2004/0122296; U.S. Patent Application No. US 2004/0122297; and U.S. Pat. No. 5,018,067, which are hereby incorporated by reference herein in their entireties. Such systems can access the data directly from internal data storage or remotely from one or more data storage sites. Thus, in a health-related data management system, wherein risk of developing a diabetic condition for a subject or a population comprises analyzing disease risk factors, the present invention provides an improvement comprising use of a data array encompassing the biomarker measurements as defined herein and/or the resulting evaluation of risk from those biomarker measurements.

A machine-readable storage medium can comprise a data storage material encoded with machine readable data or data arrays which, when using a machine programmed with instructions for using said data, is capable of use for a variety of purposes, such as, without limitation, subject information relating to health state risk factors over time or in response to drug therapies, drug discovery, and the like. Measurements of effective amounts of the biomarkers of the invention and/or the resulting evaluation of risk from those biomarkers can be implemented in computer programs executing on programmable computers, comprising, inter alia, a processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code can be applied to input data to perform the functions described above and generate output information. The output information can be applied to one or more output devices, according to methods known in the art. The computer may be, for example, a personal computer, microcomputer, or workstation of conventional design.

Each program can be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. The language can be a compiled or interpreted language. Each such computer program can be stored on a storage media or device (e.g., ROM or magnetic diskette or others as defined elsewhere in this disclosure) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer to perform the procedures described herein. The health-related data management system of the invention may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform various functions described herein. Levels of a specific amount of one or more biomarkers can then be determined and compared to a reference value, e.g., a control subject or population whose state is known or an index value or baseline value. The reference sample or index value or baseline value may be taken or derived from one or more subjects who have been exposed to the treatment, or may be taken or derived from one or more subjects who are at low risk of developing a health state (e.g., disease, pre-disease, or a pre-disease condition), or may be taken or derived from subjects who have shown improvements in risk factors (such as clinical parameters or traditional laboratory risk factors as defined herein) as a result of exposure to treatment. Alternatively, the reference sample or index value or baseline value may be taken or derived from one or more subjects who have not been exposed to the treatment. For example, samples may be collected from subjects who have received initial treatment for disease, pre-disease, or a pre-disease condition and subsequent treatment for disease, pre-disease, or a pre-disease condition to monitor the progress of the treatment. A reference value can also comprise a value derived from risk prediction algorithms or computed indices from population studies such as those disclosed herein.

The biomarkers of the present invention can thus be used to generate a reference biomarker profile of those subjects who do not have a health state (e.g., impaired glucose tolerance in the case of Diabetes), and would not be expected to develop the health state. The biomarkers disclosed herein can also be used to generate a subject biomarker profile taken from subjects who have a health state such as disease, pre-disease, or a pre-disease condition. The subject biomarker profiles can be compared to a reference biomarker profile to diagnose or identify subjects at risk for developing the health state, to monitor the progression of the health state (e.g., disease), as well as the rate of progression of the health state, and to monitor the effectiveness of any treatments for the health state. The reference and subject biomarker profiles of the present invention can be contained in a machine-readable medium, such as but not limited to, digital and analog media like those readable by a VCR, CD-ROM, DVD-ROM, USB flash media, among others. Such machine-readable media can also contain additional test results, such as, without limitation, measurements of clinical parameters and traditional laboratory risk factors. Alternatively or additionally, the machine-readable media can also comprise subject information such as medical history and any relevant family history. The machine-readable media can also contain information relating to other disease-risk algorithms and computed indices such as those described herein.

A diagnostic test that is developed in accordance with the principles of the present invention can be used to make a determination for a live subject (e.g., a diagnosis or prognosis) based, at least in part, on the presence or level(s) of one or more biomarkers present in a live clinical sample from the live subject. The levels are determined, as is understood by those of ordinary skill in the art, within the sensitivity and specificity parameters of the test format selected (e.g., a biomarker is “absent” if its level is below the test's limit of detection or some other cut-off value). For example, one such diagnostic test may involve comparing the subject's biomarker level(s) to a reference value. As another example, the diagnostic test may involve evaluating the live subject's biomarker level(s) (and optionally other information for the subject such as, for example, age, weight, ethnicity, medical history, and/or other clinical information) with a formula or model that produces a diagnostic or prognostic score for the live subject.

A diagnostic test for a given health state may be developed, at least in part, through the use of a legacy clinical sample set. The legacy clinical sample set may include samples for a cohort of legacy subjects, for whom at least some data is known regarding the presence or absence of the health state. For example, a diagnostic test may be developed based on samples for legacy subjects who are known to have a given disease. Alternatively or additionally, the diagnostic test may be developed based on clinical samples for legacy subjects who are known to lack the disease or other health state.

Theoretically, an almost limitless number of biomarkers are available for selection within the process of developing a diagnostic test. However, only a subset of all available biomarkers (e.g., between 10 and 300) is typically selected per disease area, which subset of biomarkers may be identified by physicians and/or other sources of information (e.g., medical journals) with expertise in the disease area. Biomarkers may also be derived from de novo research using “open” proteomics profiling technologies such as mass spectrometry, LC-LC mass spectrometry, 2-D gel electrophoresis, protein arrays, western blots, reverse western tissue blots, etc.

In an embodiment of the present invention, systems and methods are provided for developing a diagnostic test, according to which (i) a set of one or more legacy clinical samples is received (e.g., 50 to 5000 legacy samples), (ii) the levels of a selected subset of biomarkers are measured from the sample(s), and (iii) the biomarker levels (and optionally clinical parameters) are analyzed for an association with the health state under consideration. This analyzing may involve, for example, using statistical analysis to determine whether a particular one or more biomarkers (and optionally particular level(s) of those biomarkers and/or clinical parameters) is correlated statistically with the presence, absence, or risk of developing the health state (e.g., progression to disease states of different severities), and/or to select one or more therapies or to monitor therapy response/efficacy. In some embodiments, a biomarker panel can be constructed and a formula derived specifically to enhance performance for use also in subjects undergoing therapeutic interventions, or a separate panel and formula may alternatively be used solely in such patient populations. An aspect of the invention is the use of specific known characteristics of biomarkers and their changes in such subjects for such panel construction and formula derivation. Such modifications may enhance the performance of various indications noted above in prevention of adverse health states, and diagnosis, therapy, monitoring, and prognosis of a health state. The biomarkers may vary under therapeutic intervention for the health state, whether lifestyle (e.g., diet and exercise), surgical (e.g., bariatric surgery) or pharmaceutical (e.g., one of the various classes of drugs mentioned herein or known to modify common risk factors or risk of disease) intervention. The biomarkers may also vary based on environmental exposure, nutritional status, neurological function, immune status, organ function, and/or blood chemistry. Alternatively or additionally, the analyzing of the biomarker may involve determining whether the inclusion of particular biomarker(s) in a formula or machine learning analysis (e.g., support vector or neural network analysis) increases the relative ability of a mathematical function resulting from the analysis to diagnose or predict the health state in a subject. Generally, machine learning is a form of artificial intelligence whereby information learned from a computer-assisted analysis of data can be used to generate a function that describes dependencies in data. This computer-assisted, machine learning analysis may be performed by any suitable software, hardware, or combination thereof (a “machine learning tool”). Suitable examples of machine learning tools will be apparent to those of ordinary skill in the art and therefore will not be described in detail.

A key feature of embodiments of the invention is the ability to profile tens, hundreds or even thousands of biomarkers in a single small legacy sample. It will be apparent that the invention thus allows the profiling of several classes of biomarkers, and the testing of multiple members of each class, in order to gain insight into the biological mechanisms of a health state and the interaction of such biomarkers. In the preferred embodiment, this encompasses two or more biomarker members per class, more preferably five or more, and most preferably ten or more. As will be appreciated by one skilled in the art, such classes include, without limitation, cytokines and chemokines, such as chemoattractants and inflammatory molecules such as acute phase reactants, signaling molecules, adhesion molecules, biomarkers of immunity (including subclasses, such as those related to individual immune cell lines such as macrophages, T-cells, neutrophils, eosinophils, etc.), biomarkers of angiogenesis and endothelial function, and biomarkers of glucose and lipid metabolism and energy storage. Several of these classes overlap, in particular with respect to the cytokine, chemokine, and growth factor members of each. Selected representative examples of such classes and their members are given in the table below, without limiting the foregoing in any way.

Examples of Classes of Molecules: Examples of Genes and Molecules in the Class
Acute Phase Reactants: SAA1, CRP, IL1, IL6, IL8, TNFA, FTL, A2M, MBL, SAP
Angiogenesis & Endothelial Function: VEGF, CD36, ANG1, ANG2/ANGPT2, ENG, FGF2, PDGF
Cell Adhesion: ICAM, DPP4/CD26, CD38, SELE, SELP, CD62L, VCAM, ITGA1, ITGA2, ITGA4, ITGAL, ITGAX, ITGB1, ITGB2, ITGB3
Cell Proliferation & Death: AKT1, CASP2, CASP8, CASP9, IGF, TNF/TNFA, TNFR1, TNFSF10, TNFSF11, CDK2, FAS, FASLG
Chemokines: CCL1, MCP-1/CCL2, CCL3, CCL4, CCL5, CCL6, MCP-3/CCL7, MCP-2/CCL8, CCL9, CCL11, CCL12, MCP-4/CCL13, CCL19, CCL21, CCL24, CCL26, CCL27, CXCL1, CXCL2, IP10/CXCL10, IL8, CX3CL1
Cytokines: IL1B, IL1RN, IL2, IL3, IL4, IL5, IL6, IL8, IL10, IL12, IL12B, IL13, IL18, BTC, TGFA, TGFB, TNF, CSF1, CSF2, CSF3, IFNG
Coagulation: C2, C3, C4, C5, C9, C1, F2, F12, PROC, PROS1, SERPING1, FGA, VWF, D-dimer
Growth Factors & Hormones: EGF, GH1, NGFB, ADIPOQ, IGF, CSF1, CSF2, CSF3, PDGF, EPO, FGF2, GDF8, GDF9, GH1, IGF1, TGFB1, TPO, EFG, HGF, FGF, IGF, BMP1, BMP2, BMP3, BMP7
Inflammation: CSF1, CSF2, CSF3, IFNG, CD40LG, CD40, C3, C5A, TNF, IL1, IL8, SELP
Lipid Metabolism: Lipoprotein(a), LEP, ADIPOQ, AGRP, NPY
Energy Homeostasis: INS, glucose, HBA1c, C-peptide, IGF-1, AKT2
Proteolysis: MMP2, MMP9, SERPINA1, heparin, SERPIND1, PAI-1/SERPINE1, TIMP1, TIMP2, CASP3

Another key aspect of the invention is, in a preferred embodiment, utilizing a single molecule detector, with the ability to span multiple orders of magnitude in concentration by using the stochastic and quantum nature of single molecule detection. In particular, biomarkers within the plasma proteome, including many of those cited above, are known to span many orders of magnitude in their molar concentration, as seen in the literature. Without limitation of the foregoing, a review of such concentrations cited from literature for cardiovascular and cancer related plasma proteins is described in Anderson, “Candidate-Based Proteomics in the Search for Biomarkers of Cardiovascular Disease”, J Physiol 563.1 pp 23-60 (2005), and Anderson, “A List of Candidate Cancer Biomarkers for Targeted Proteomics”, Biomarker Insights 2: 1-48 (2006), which are hereby incorporated by reference herein in their entireties. As shown in the table below and in FIG. 6, this range of concentrations rapidly approaches single molecule requirements, particularly when combined with the smaller volume samples commonly available in legacy clinical sample sets.

Concentration of 50 kDa molecule    pg/mL             fmol/mL          Molecules/mL
50 mg/mL                            50,000,000,000    1,000,000,000    6.02 × 10¹⁷
10 mg/mL                            10,000,000,000    200,000,000      1.20 × 10¹⁷
1 mg/mL                             1,000,000,000     20,000,000       1.20 × 10¹⁶
100 ug/mL                           100,000,000       2,000,000        1.20 × 10¹⁵
10 ug/mL                            10,000,000        200,000          1.20 × 10¹⁴
1 ug/mL                             1,000,000         20,000           1.20 × 10¹³
100 ng/mL                           100,000           2,000            1.20 × 10¹²
10 ng/mL                            10,000            200              1.20 × 10¹¹
1 ng/mL                             1,000             20               1.20 × 10¹⁰
100 pg/mL                           100               2                1.20 × 10⁹
10 pg/mL                            10                0.2              1.20 × 10⁸
1 pg/mL                             1                 0.02             1.20 × 10⁷
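The unit conversions behind a concentration table of this kind can be checked with a short calculation. The following is a minimal sketch, assuming a 50 kDa molecule (50,000 g/mol) as in the table and expressing molar concentration in fmol/mL (1 fmol = 10⁻¹⁵ mol); the function name `convert` is illustrative only and is not part of the disclosure.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def convert(mass_g_per_ml, mol_weight_g_per_mol=50_000.0):
    """Return (pg/mL, fmol/mL, molecules/mL) for a mass concentration in g/mL.

    The 50 kDa default mirrors the table; any other molecular weight can be passed.
    """
    mol_per_ml = mass_g_per_ml / mol_weight_g_per_mol
    pg_per_ml = mass_g_per_ml * 1e12      # 1 g = 1e12 pg
    fmol_per_ml = mol_per_ml * 1e15       # 1 mol = 1e15 fmol
    molecules_per_ml = mol_per_ml * AVOGADRO
    return pg_per_ml, fmol_per_ml, molecules_per_ml


if __name__ == "__main__":
    for label, grams in [("50 mg/mL", 50e-3), ("1 ug/mL", 1e-6), ("1 pg/mL", 1e-12)]:
        pg, fmol, molecules = convert(grams)
        print(f"{label:>9}: {pg:.3g} pg/mL, {fmol:.3g} fmol/mL, {molecules:.3g} molecules/mL")
```

Under these assumptions, each tenfold drop in mass concentration lowers the molecule count per mL tenfold, which is why picogram-level analytes in microliter-scale samples approach single molecule counting.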

Concentration ranges of common biomarkers within the plasma proteome, indicating the disagreement of biomarker discovery technology such as mass spec across sample sets in the literature, are also shown in Anderson et al., “The Human Plasma Proteome: History, Character, and Diagnostic Prospects”, Molecular & Cellular Proteomics 1.11, pp. 845-867 (2002) and Anderson et al., “The Human Plasma Proteome: A Nonredundant List Developed By Combination of Four Separate Sources”, Molecular & Cellular Proteomics 3.4, pp. 311-326 (2004), which are hereby incorporated by reference herein in their entireties. Such disagreement further demonstrates the different detection system needs inherent when encountering broad concentration ranges, which may occur both across many analytes and across many differing health states. FIG. 5 demonstrates the practice of the invention across multiple orders of magnitude in concentration, and across representative biomarkers of each of the aforementioned classes.

PRACTICE OF THE INVENTION

In a preferred embodiment, the biomarker levels are measured from the clinical sample(s) through the use of a single molecule detector. Suitable single molecule detection equipment is described in U.S. Patent Application Publication Nos. 2004/0166514 A1, 2005/0164205 A1, and 2006/0003333 A1, the disclosures of which are hereby incorporated by reference herein in their entireties. Other examples of single molecule detectors that can be used in accordance with preferred embodiments of the present invention are described in U.S. Patent Application Publication No. 2005/0221408, PCT Publication No. WO 2005/089524, and Richard Brown et al., “Review of Techniques for Single Molecule Detection in Biological Applications,” National Physical Laboratory Report, 2001, the disclosures of which are hereby incorporated by reference herein in their entireties. Generally, a single molecule detector operates under the principle that the ultimate, and desired, detection of biomarker information occurs at the level of individual molecules, interactions between molecules, and molecular complexes. Such individual molecules, molecular interactions, and/or molecular complexes can be detected by flow cytometry, single molecule electrophoresis, ion-channel switch membrane biosensor, or other single-molecule analytical instrumentation. Single molecule information can be cumulated over multiple molecular events, providing dynamic quantification of biomarker levels within a clinical sample, allowing the sparing use of very small samples. Data acquisition of such events may be halted when a sufficient number of events are received within a given sample volume to reliably quantitate (e.g., reliably here meaning with a coefficient of variation of 20% or less) a given biomarker's concentration using a presumed Poisson or binomial probability distribution function, as known by one skilled in the art. Such dynamic quantitation of very small sample volumes is a key aspect of the invention as practiced using single molecule detectors.
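The stopping criterion described above can be illustrated with a short calculation. The following is a minimal sketch, assuming purely Poisson counting statistics, for which the coefficient of variation of a count of N events is 1/sqrt(N); the function names are hypothetical, and instrument background and other noise sources are ignored.

```python
import math


def poisson_cv(events):
    """Coefficient of variation of a Poisson count: sqrt(N) / N = 1 / sqrt(N)."""
    if events <= 0:
        raise ValueError("events must be positive")
    return 1.0 / math.sqrt(events)


def events_needed(target_cv):
    """Smallest whole number of events whose Poisson CV is at or below target_cv."""
    # The small offset guards against floating-point round-up at exact boundaries.
    return math.ceil(1.0 / target_cv ** 2 - 1e-9)


def should_stop(events_so_far, target_cv=0.20):
    """Hypothetical acquisition rule: halt once enough events have accumulated."""
    return events_so_far >= events_needed(target_cv)


if __name__ == "__main__":
    print("events needed for CV <= 20%:", events_needed(0.20))
    print("CV at 1000 events:", round(poisson_cv(1000), 4))
```

Under this assumption, a 20% coefficient of variation corresponds to 25 counted events, which is why a low-abundance biomarker can in principle be quantitated from a very small number of detected molecules.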

Accordingly, embodiments of the present invention contemplate the specific application of single molecule detection to the development of diagnostic tests based on legacy clinical sample sets. Namely, it has been determined by the present inventors that single molecule detection can detect the presence of a biomarker, or levels thereof, with a suitable sensitivity using only about 1 uL or less of sample per single-biomarker immunoassay (for example). Any suitable analyte recognition unit (e.g., antibodies, aptamers, molecular imprints, probes, primers, etc., which have differentially greater affinity for a biomarker of interest) and signal detection technique can be used with a single molecule detection reader in accordance with the present invention. Additionally, it will be understood that the present invention is not limited to the use of immunoassays. Thus, for example, to develop a diagnostic test based on an initial subset of (for example) 300 biomarkers, the use of single molecule detection requires a sample size of only about 0.3 mL (i.e., 1 uL per assay × 300 biomarkers), or about 0.9 mL if the assays are done in triplicate. The assay may use a 96-well format, a 384-well format, or any other suitable assay configuration. Any multiplexing within the assay will only further reduce the required sample size. The present inventors have applied this knowledge to the discovery that diagnostic tests can be developed based on legacy clinical samples which, as described, are typically available in sizes of 0.05 to 1.0 mL or less. Additional details regarding an illustrative single molecule detection system are provided below.
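The sample-volume arithmetic in the preceding paragraph can be sketched as follows. This is a minimal illustration, assuming one single-biomarker assay per biomarker, no multiplexing, and no dead volume; the function names are hypothetical.

```python
def required_volume_ml(n_biomarkers, ul_per_assay=1.0, replicates=1):
    """Total sample volume (mL) for one assay per biomarker, repeated `replicates` times."""
    return n_biomarkers * ul_per_assay * replicates / 1000.0


def max_biomarkers(sample_ml, ul_per_assay=1.0, replicates=1):
    """Largest number of biomarkers that a sample of `sample_ml` can support."""
    # The small offset guards against floating-point truncation at exact boundaries.
    return int(sample_ml * 1000.0 / (ul_per_assay * replicates) + 1e-9)


if __name__ == "__main__":
    print(required_volume_ml(300))                # singlicate panel of 300 biomarkers
    print(required_volume_ml(300, replicates=3))  # same panel in triplicate
    print(max_biomarkers(0.05))                   # smallest typical legacy sample
```

With 1 uL per assay, a 300-biomarker panel needs about 0.3 mL in singlicate and about 0.9 mL in triplicate, matching the figures stated above; a 0.05 mL sample supports roughly 50 singlicate assays under the same assumptions.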

In some embodiments, the single molecule detection system can rely on single-molecule fluorescence. Thus, in such embodiments, no polymerases, enzymes or proteins, or any amplification processes are necessary, so sample preparation times and complexity are minimal. In other embodiments, the single molecule detection may utilize labeled antibodies. Such labels for individual antibodies (or other suitable biomarker recognition units) may themselves be constructed of a plurality of individual fluorescent molecules, further amplifying the signal derived from each single complex multi-fold, and further reducing the detection technique requirements for single molecule detection (such multiplexing of fluorophores may be achieved using beads, dendrimers, polysaccharides and other natural and synthetic polymers, amongst other techniques well described in the art). In one embodiment, the basic detection apparatus may comprise one or two lasers (or a single laser source split into two beams), focusing light-collection optics, one or two single photon detectors, and detection electronics under computer control. FIGS. 1 and 2 are illustrative diagrams of a single molecule detector in accordance with an illustrative, but non-limiting, embodiment of the present invention. A sample compartment is also included and may comprise two reservoirs that hold the solution being analyzed. The reservoirs can be connected by tubing to a glass capillary cell.

The system also may include a glass capillary flow cell. For example, two laser beams (5 um in diameter) are optically focused about 100 um apart and perpendicular to the length of the sample-filled capillary tube. The lasers generally are operated at particular wavelengths depending upon the nature of the detection probe to be excited. An interrogation volume of the detection system may be determined by the diameter of the laser beam and by the segment of the laser beam selected by the optics that direct light to the detectors. The interrogation volume is preferably set such that, with an appropriate sample concentration, single molecules (such as single biomarker-recognition unit hybrids, single nucleic acid probes or single probe-target hybrids) are present in the interrogation volume during each time interval over which observations are made. Another embodiment of an apparatus for use in accordance with the present invention uses the same capillary flow cell and detection system, but only uses a single laser beam and detector.

With the above-described instrument configuration (5 um laser beam), approximately 0.25% of the fluorescent molecules in the solution pass through the laser beams and are typically detectable. This percentage can be increased by configuring each laser beam such that it forms a narrow band perpendicular to the length of the capillary. Such an arrangement can raise the percentage of detectable molecules to approximately 5% of the molecules in the solution. Other configurations illuminating larger areas of the capillary have been calculated to enable detection of up to (for example) 50% of the fluorescent molecules present in a sample. The device has the capability of detecting single molecules in real time, allowing the detection of a fixed number of counts independent of time, and enabling dynamic quantification and concentration range finding during the course of the initial detection period. This feature allows faster readouts of samples, as reaching a count threshold (for example, 1000 molecular events or such other effective level, giving a statistically valid quantitation of a biomarker within a sample) is often much faster than waiting for a fixed time point (e.g., 1 minute). For higher biomarker concentrations, preparatory sample dilution may nonetheless be required in order to avoid reaching the count threshold too rapidly in such single molecule detector configurations.
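The effect of the detectable fraction on the number of countable events can be estimated with a short calculation. The following is a rough sketch only, assuming a 50 kDa analyte, that the entire sample volume flows through the capillary, and that every molecule in the illuminated fraction yields exactly one event; the function name is hypothetical and labeling efficiency and background are ignored.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def detectable_molecules(pg_per_ml, sample_ul, fraction, mol_weight=50_000.0):
    """Expected detectable molecules in `sample_ul` microliters of sample.

    `fraction` is the share of molecules passing through the illuminated
    region (e.g., 0.0025, 0.05, or 0.5 for the configurations described).
    """
    molecules_per_ml = pg_per_ml * 1e-12 / mol_weight * AVOGADRO
    return molecules_per_ml * (sample_ul / 1000.0) * fraction


if __name__ == "__main__":
    for fraction in (0.0025, 0.05, 0.5):
        n = detectable_molecules(pg_per_ml=1.0, sample_ul=1.0, fraction=fraction)
        print(f"detectable fraction {fraction:.2%}: about {n:.0f} molecules")
```

Under these assumptions, 1 uL of a 1 pg/mL sample yields on the order of 30 detectable molecules at a 0.25% detectable fraction, which illustrates why the wider illumination geometries matter most for the lowest-abundance biomarkers.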

FIG. 3 is a flowchart of illustrative, exemplary stages involved in developing a diagnostic test in accordance with some of the embodiments of the present invention, including: identification of biomarker candidates, sourcing of reagents, assay development, procurement of clinical samples, interrogation of clinical samples with biomarker assays, and analysis of the data to identify predictive markers and incorporate the results into predictive tests. These illustrative stages are described in greater detail below.

Identify biomarkers: Biomarkers may be identified by way of a comprehensive search through scientific and patent literature, supplemented with expert review. Based on an understanding of biological mechanisms associated with progression in a given disease area, standard search terms are developed to generate disease-specific databases containing typically thousands of journal articles and hundreds of patents. Canonical pathways, homology, and linkage studies are alternative means of identifying putative biomarkers for a given disease state, as are cell line and animal experiments utilizing mRNA expression under response to stimuli, active agents (drugs, siRNAs, etc.), or in disease-specific organisms (knock-outs, nude mice, ApoE deficient mice, etc.), as are well known to those versed in the art of biomarker discovery. Analytical techniques on larger sample volumes, or pooled sample volumes, may also be used as in Granger, et al., “Discovery of Proteins Related to Coronary Artery Disease Using Industrial-Scale Proteomics Analysis of Pooled Plasma”, American Heart Journal v152 (3) September 2006, which is hereby incorporated by reference herein in its entirety. Each article and patent is read to identify candidates, which are organized in a spreadsheet. For each biomarker, standardized nomenclature derived from human genome databases is applied to eliminate redundancy and enter standardized annotations.

A score for evidence level is assigned to prioritize the potential value of each biomarker based on experimental data. The evidence level may be combined with protein cellular expression localization to create an overall prioritized list of biomarkers for each disease. At the end of this process, the list of candidates is typically 150-400 biomarkers, but may be more or less. Illustrative lists of biomarkers for use in developing diagnostic tests for diabetes and osteoporosis are described in U.S. Provisional Patent Application Nos. 60/725,462, filed Oct. 11, 2005, 60/771,077, filed Feb. 6, 2006, Ser. No. 11/546,874, filed Oct. 11, 2006, Ser. No. 11/703,400, filed Feb. 6, 2007, and U.S. application Ser. No. 11/788,260, filed Apr. 18, 2007, titled “Diabetes-Associated Markers and Methods of Use Thereof” and bearing attorney docket no. 24748-502 CIP, which are all hereby incorporated by reference herein in their entireties.

Source Reagents: Table 1 below shows a large and diverse array of vendors that may be used to source immunoreagents as a starting point for assay development. Using the prioritized list of markers, a search may be performed for capture antibodies, detection antibodies, and analytes that can be used to configure a working sandwich immunoassay.

For example, in one disease area, diabetes, 156 of 208 biomarkers were successfully sourced. Depending on the specific disease area, it is anticipated that anywhere from 50 to 80% of the biomarkers on any list are available from commercial sources. The reagents are ordered and received into inventory.

TABLE 1 Immunoreagent Vendors
Company
Abazyme AbCam AbGent AbKem Abnova Absea Biotechnology Academy Biomed Accurate Chemical and Scientific Corporation Acris Advanced Immunochemical, Inc. Advanced Targeting Systems Affibody Affiniti Research Products Limited Affinity Biologicals Affinity Bioreagents Alexis Biochemicals Alomone Labs Alpha Diagnostic Intl. AlphaGenix American Diagnostica Inc. American Qualex American Research Products American Type Culture Collection Anaspec ANAWA Trading SA Ancell AngioBio Angio-Proteomie Aniara Anogen Antibodies Incorporated AntibodyBcn AntibodyShop Apotech APTEC Diagnostics Araclon Biotech Assay Designs Athens Research and Technology Austral Biologicals Aves Labs Aviva Antibody Axxora Babraham Technix Bachem Beckman Coulter, Inc. Bender Medsystems Bethyl Laboratories Bio Research Canada BioCore BioCytex Biodesign International Biogenesis BioGenex BioLegend Biomarket Biomeda Corporation Biomedical Technologies BIOMOL International BioProcessing Biosense Laboratories BioSepra Biosonda BioSource International BiosPacific Biostride Biotrend Biovendor Laboratory Medicine Biovet BMA Biomedical Boston Biochem Brendan Scientific Calbiochem Caltag Cambio CanAg Diagnostics Capralogics Capricorn Products Cayman Chemical Company Cedarlane Laboratories Cell Marque Cell Sciences Cell Signaling Technology Cemines Chang Bioscience Chemicon International Chemokine Clonegene Clontech Cortex Biochem Covance Research Products Cytolab Cytopulse Cytoshop CytoStore DAKO Deltabiolabs Development Studies Hybridoma Bank Diaclone Diagnostic BioSystems Diagnostic Systems Laboratory Diasorin Diatec Dolfin Dutch Diagnostics East Coast Biologics eBioscience Echelon Research Laboratorie ECM Biosciences EnCor Biotechnology Endocrine Technologies Enzo Biochem Epitomics Euroclone Euro-Diagnostica Eurogentec Everest Biotech Exalpha EXBIO Praha EY Laboratories FabGennix Int. Fitzgerald Industries International Fortron Bio Science Fusion Antibodies FutureImmune Immunologic Technical and Consulting Services Gallus Immunotech G-Biosciences GEMAC Genesis Biotech G-Biosciences Genex Genhot Laboratories Genway Biotech GloboZymes Good Biotech Green Mountain Antibodies Groovy Blue Genes Biotech Haematologic Technologies Hampton Research Histoline Laboratoires HyCult Biotechnology HyTest IBL IBT IDS Imgenex IMMCO Diagnostics Immunodetect Immunodiagnostik ImmunoGlobe Antikoerpertechnik ImmunoKontact ImmunologicalsDirect Immunology Consultants Laboratory Immunometrics Immuno-Precise Services Immunostar Immunostep ImmunoTools Immunovision Immuquest Biogenex Innova Biosciences Innovation Automation Innovex Insight Biotechnology International Enzymes Invitek Invitrogen IQ Products Isconova ISL (Immune Systems Ltd) Jackson ImmunoResearch Laboratory KCH Scientific Kirkegaard & Perry Laboratorie KMI Diagnostics Koma Biotech Kordia Laboratory Supplies Lab Vision Corporation LabFrontier Life Science Institute LAE Biotechnology Company Lampire Biological Laborator Lee Laboratories Leinco Technologies Lifescreen Linco Research Maine Biotechnology Services MBL International Mediclone Medix Biochemica MedSystems Diagnostics GmbH MicroPharm Ltd. MilleGen MitoSciences MoBiTec ModiQuest Molecular Innovations Molecular Probes MP Biomedicals Mubio Products NatuTec Neoclone Neuromics New England Biolabs Nordic Immunological Laboratories Norrin Laboratories Novocastra Novus Biologicals OEM Concepts, Inc. Oncogene Research Products Open Biosystems Orbigen Oxford Biotechnology Pacific Immunology Pall Corporation Panvera PBL Biomedical Laboratories Peprotech, Inc. PerkinElmer Life Sciences Perseus Proteomics Pharmingen Phoenix Pharmaceuticals PickCell Laboratories Pierce Chemical Company PlasmaLab International, Inc. Polymun Scientific Polysciences, Inc. PRF&L Pro-Chem Progen Promab Biotechnologies Promega Corporation ProSci Proteogenix Protos Immunoresearch QED Biosciences, Inc. Quidel Corporation R&D Systems Randox Repligen Research Diagnostics Roboscreen Rockland Immunochemicals Rose Biotech Santa Cruz Biotechnology SCIpac Scottish Agricultural Science ScyTek Laboratories Seikagaku America Seramon Serological Corporation Serotec SigmaAldrich Signature Immunologics Signet Laboratories Silver Lake Research Southern Biotechnology Associates SPI-BIO Statens Serum Institut StemCell Technologies Sterogene Bioseparations Strategic Biosolutions Stressgen Structure Probe, Inc. (SPI) SWant Synaptic Systems GmbH SynthOrg Biochemicals, Ltd. Technopharm Terra Nova Biotechnology Tetra Link International The Biotech Source TiterMax Transmissible Spongiform Encephalopathy Research Center Trevigen Trillium Diagnostics Triple Point Biologics Tulip Biolabs Union Stem Cell & Gene Engineering Company Upstate Biotechnology US Biological Vector Laboratories Ventana Medical Systems, Inc Vision BioSystems Wako Pure Chemical Industrie WolwoBiotech Company Zeptometrix

Develop Immunoassays: Immunoassays are preferably developed in three steps: Prototyping, Validation, and Kit Release.

Prototyping: Prototyping may be done using standard ELISA formats if the two antibodies used in the assay are from different host species. Using standard conditions, anti-host secondary antibodies conjugated with horseradish peroxidase are evaluated in a standard curve. If a good standard curve is detected, the assay proceeds to the next step. Assays in which both antibodies come from the same host (i.e., mouse monoclonal sandwich assays) go directly to the next step.

Validation: Validation of a working assay may be performed using single molecule detection technology. The detection antibody is first conjugated to fluorescent molecules, typically Alexa 647. The conjugations use standard NHS ester chemistry, for example, according to the manufacturer's instructions. Once the antibody is labeled, the assay is tested in a sandwich assay format using standard conditions. Each assay well is solubilized in a denaturing buffer, and the material is read on the single molecule detection platform.

FIG. 2 shows a typical result for a working standard curve. Once a working standard curve is demonstrated, the assay may be applied to 24 serum samples (for example) to determine the normal distribution of the target analyte across clinical samples. The amount of serum required to measure the biomarker within the linear dynamic range of the assay is determined, and the assay proceeds to kit release. In the present example, based on 39 validated assays, 0.004 microliters are used per well on average.
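
To make the volume arithmetic concrete, the following Python sketch estimates how many biomarker assays a single legacy sample might support. Only the 0.004 microliter per-well average comes from the example above; the function name, the default of three replicate wells per assay, and the neglect of pipetting dead volume and dilution overhead are illustrative assumptions.

```python
def assays_supported(sample_volume_ul, serum_per_well_ul=0.004, replicates=3):
    """Estimate how many biomarker assays one sample can support.

    Assumes (for illustration only) that every assay uses `replicates`
    wells and that no serum is lost to dead volume or dilution overhead.
    """
    if sample_volume_ul < 0 or serum_per_well_ul <= 0 or replicates <= 0:
        raise ValueError("volumes and replicate count must be positive")
    serum_per_assay_ul = serum_per_well_ul * replicates
    return int(sample_volume_ul / serum_per_assay_ul)


# A 0.1 mL (100 microliter) sample, assayed in triplicate wells.
print(assays_supported(100.0))
```

Under these idealized assumptions the serum consumed per assay is far smaller than the sample itself, which is why hundreds of biomarkers can plausibly be measured from well under 1 mL. Real consumption would be higher once handling losses are included.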

Kit Release: Each component of the kit, including manufacturer, catalog numbers, lot numbers, stock and working concentrations, standard curve, and serum requirements, may be compiled into a standard operating procedure for each biomarker assay. This kit may then be released for use in testing clinical research samples.

Acquiring Clinical Samples: Depending on the specification of the diagnostic test being developed, the clinical samples preferably have (for example) clinical annotations that track progression of disease, and preferably also include measurements of underlying mechanisms or disease phenotypes, and/or have disease outcomes using longitudinal samples over time. Relationships with the investigators may then be developed, and a contractual agreement is put into place. For each clinical study, the typical sample volumes range from 0.1 to 1 mL.

Import Clinical Annotations: Samples arrive frozen on dry ice, and each sample is stored at −80° C. Each sample typically has tens to hundreds of clinical annotations associated with it. The clinical annotations associated with each sample set may be brought into a standardized nomenclature prior to import. All of the clinical annotations associated with each sample are then imported into a relational database.
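
As a minimal sketch of this import step, the Python example below stores annotations in an SQLite database after mapping each raw annotation name onto a standardized name. The table layout, the function names, and the synonym dictionary are hypothetical; the source does not describe the actual schema or database product.

```python
import sqlite3


def create_annotation_db(path=":memory:"):
    """Create a small, hypothetical schema for samples and annotations."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE sample (
            sample_id TEXT PRIMARY KEY,
            study     TEXT NOT NULL
        );
        CREATE TABLE annotation (
            sample_id TEXT NOT NULL REFERENCES sample(sample_id),
            name      TEXT NOT NULL,
            value     TEXT,
            PRIMARY KEY (sample_id, name)
        );
        """
    )
    return conn


def import_annotations(conn, sample_id, study, raw_annotations, synonyms):
    """Standardize annotation names, then store them for one sample."""
    conn.execute(
        "INSERT OR IGNORE INTO sample VALUES (?, ?)", (sample_id, study)
    )
    for raw_name, value in raw_annotations.items():
        key = raw_name.strip().lower()
        # Fall back to the cleaned raw name when no synonym is defined.
        name = synonyms.get(key, key)
        conn.execute(
            "INSERT OR REPLACE INTO annotation VALUES (?, ?, ?)",
            (sample_id, name, str(value)),
        )
    conn.commit()


conn = create_annotation_db()
import_annotations(
    conn, "S001", "study_a",
    {"Sex ": "F", "BMI": 27.5},
    {"sex": "gender", "bmi": "body_mass_index"},
)
```

Storing one row per sample and annotation name keeps the schema stable when different studies supply different sets of annotations.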

Prepare Clinical Samples: The frozen aliquots are thawed and aliquoted for use in the laboratory. Each clinical sample is thawed on ice, and aliquots are dispensed into barcoded tubes (daughter tubes). Each daughter tube is stored at −80° C. until it is needed for immunoassays. The daughter tubes are then arrayed into sample plates. Each barcoded daughter tube to be assayed is arrayed into barcoded 96 or 384 well plates (sample plates). This mapping of daughter tube to sample plate well is tracked by the relational database.
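
The mapping of daughter tube to sample plate well can be sketched as follows. This Python example assigns barcodes to wells in row-major order on a 96 or 384 well plate; the fill order, the well-naming format, and the function name are assumptions made for illustration and are not specified in the source.

```python
import string

# Rows x columns for the two plate formats mentioned in the text.
PLATE_SHAPES = {96: (8, 12), 384: (16, 24)}


def assign_wells(tube_barcodes, plate_size=96):
    """Map each daughter-tube barcode to a well name such as 'A01'."""
    rows, cols = PLATE_SHAPES[plate_size]
    if len(tube_barcodes) > rows * cols:
        raise ValueError("more tubes than wells on the plate")
    if len(set(tube_barcodes)) != len(tube_barcodes):
        raise ValueError("duplicate tube barcode")
    mapping = {}
    for index, barcode in enumerate(tube_barcodes):
        row, col = divmod(index, cols)  # fill left to right, top to bottom
        mapping[barcode] = f"{string.ascii_uppercase[row]}{col + 1:02d}"
    return mapping


wells = assign_wells([f"T{i}" for i in range(14)], plate_size=96)
print(wells["T0"], wells["T11"], wells["T12"])
```

Each (barcode, well) pair produced this way could then be written to the relational database alongside the plate barcode.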

Run Immunoassays: Each sample plate is now prepared for immunoassays. In one example, 384 well barcoded assay plates may be dedicated to one biomarker per plate. Typically, 4-12 assay plates are derived from each sample plate, depending on the amount of serum required for each assay. The sample plate goes through a series of dilutions to ensure that the clinical samples are at an appropriate dilution for each immunoassay. The clinical samples are then deposited into the assay plate wells in triplicate for each marker. Again, the mapping of each sample plate well to assay plate well is tracked in the relational database. The assays may then be processed using standard immunoassay procedures, and the assay plate is read on a single molecule detection instrument. Each run contains data for a single biomarker across multiple clinical samples, typically around one hundred. The resulting data files may then be imported back into the relational database, where standard curves can be calculated and the concentration values for each biomarker for each sample can be calculated. FIG. 3 shows an example of single molecule detection data across 92 samples for 25 biomarkers.
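
One way the concentration calculation might look is sketched below in Python. The source does not say how the standard curve is fitted, so piecewise linear interpolation on log-log axes is used here purely as a stand-in; the function names, the data layout, and the handling of out-of-range replicates are likewise assumptions.

```python
import math
from bisect import bisect_left
from statistics import mean


def interpolate_concentration(signal, standards):
    """Read a concentration off a standard curve by log-log interpolation.

    `standards` is a list of (concentration, signal) pairs with positive
    values and a signal that rises strictly with concentration.
    Returns None when the signal lies outside the range of the standards.
    """
    points = sorted((math.log10(s), math.log10(c)) for c, s in standards)
    log_signals = [p[0] for p in points]
    x = math.log10(signal)
    if not log_signals[0] <= x <= log_signals[-1]:
        return None
    i = min(max(bisect_left(log_signals, x), 1), len(points) - 1)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return 10 ** (y0 + (y1 - y0) * (x - x0) / (x1 - x0))


def sample_concentration(replicate_signals, standards, dilution_factor=1.0):
    """Average the in-range replicates and correct for sample dilution."""
    values = [interpolate_concentration(s, standards) for s in replicate_signals]
    in_range = [v for v in values if v is not None]
    if not in_range:
        return None
    return mean(in_range) * dilution_factor


# Hypothetical standards: (concentration, instrument signal).
curve = [(1.0, 10.0), (10.0, 100.0), (100.0, 1000.0)]
print(sample_concentration([100.0, 100.0, 100.0], curve, dilution_factor=400))
```

Replicates that fall outside the standards are dropped rather than extrapolated. That is a conservative choice for a sketch, and a production pipeline would more likely flag such wells for re-assay at a different dilution.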

Analyze Data: The quantitative biomarker data can now be correlated to the clinical annotations associated with each sample. Any number of statistical formulas or machine learning approaches on single or multiple markers can be used to identify disease states, risk for disease, or biomarker patterns that have commercial potential to diagnose or prognose disease state (for example).
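
As one simple instance of such an analysis, the Python sketch below ranks biomarkers by the absolute Pearson correlation between their measured concentrations and a numeric clinical annotation. The source leaves the choice of statistic open, so Pearson correlation is only an illustrative example, and the function names and sample data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def rank_biomarkers(biomarker_values, outcome):
    """Sort biomarkers by |r| against the outcome, strongest first."""
    scores = {
        name: pearson_r(values, outcome)
        for name, values in biomarker_values.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical concentrations for two markers across four samples.
ranking = rank_biomarkers(
    {"marker_a": [1, 2, 3, 4], "marker_b": [1, 3, 2, 4]},
    outcome=[1, 2, 3, 4],
)
print(ranking[0][0])
```

A ranking like this is only a screening step. Markers that score well would still need validation on independent samples before being combined into a diagnostic model.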

The following is an illustrative example of a Standard Operating Procedure (SOP) for use in developing diagnostic tests in accordance with an embodiment of the present invention.

Assay Analyte: C-Reactive Protein

Components:

Component            Vendor           Catalog Number   Lot Number
C-Reactive Protein   US Biologicals   C7907-26A        L5042910
Capture Antibody     US Biologicals   C7907-09         L4030562
Detection Antibody   US Biologicals   C7907-10         L2121306M

-   1. Plate Coating: Coat and block immunoassay plates for analyte capture
    -   1.1. Materials
        -   1.1.1. NUNC Maxisorp 384 well plates, Cat. No. 460518
        -   1.1.2. NUNC Acetate Sealers, Cat. No. 235306
        -   1.1.3. Coating buffer
            -   1.1.3.1. 0.05 M carbonate, pH 9.6
            -   1.1.3.2. Store at 4° C. for up to 2 months
        -   1.1.4. Capture Antibody
        -   1.1.5. Wash buffer A
            -   1.1.5.1. PBS with 0.1% TWEEN 20
            -   1.1.5.2. Store at room temperature for up to 2 months
        -   1.1.6. Blocking buffer
            -   1.1.6.1. 1% BSA, 5% sucrose, 0.05% NaN₃ in PBS
            -   1.1.6.2. Store at 4° C. for up to 1 month
        -   1.1.7. Microplate washer
    -   1.2. Procedure
        -   1.2.1. Dilute capture antibody to 1 microgram/mL in coating buffer. (Prepare immediately before use)
        -   1.2.2. Add 20 microliters of diluted capture antibody per well
        -   1.2.3. Seal and shake for 2 minutes on plate shaker
        -   1.2.4. Centrifuge at 1000 rpm for 2 min, 25° C.
        -   1.2.5. Incubate overnight at room temperature (no shaking)
        -   1.2.6. Wash 3× with 100 microliters wash buffer A
        -   1.2.7. Add 30 microliters blocking buffer per well
        -   1.2.8. Seal and shake for 2 minutes on Jitterbug setting 7
        -   1.2.9. Centrifuge at 1000 rpm for 2 min, 25° C.
        -   1.2.10. Incubate at least 2 hours at room temperature (no shaking)
        -   1.2.11. Dump plate and blot upside down (no wash)
        -   1.2.12. Air dry the blocked plates (uncovered) at least 5 hours at room temperature
        -   1.2.13. Cover the dry plates with acetate sealer
        -   1.2.14. Store at 4° C. for up to one month
-   2. Single Molecule Detection Assay: Add clinical samples to coated plates and quantify
    -   2.1. Materials
        -   2.1.1. Coated, blocked NUNC Maxisorp 384 well plate
        -   2.1.2. NUNC Acetate Sealers, Cat. No. 235306
        -   2.1.3. Assay buffer
            -   2.1.3.1. BS* with 1% BSA, 0.1% TRITON X-100
            -   2.1.3.2. Store at 4° C. for up to 1 month
        -   2.1.4. Standard Calibrator diluent
            -   2.1.4.1. Assay buffer + additional 5% BSA
            -   2.1.4.2. Enough volume for standard curve, including 0 pg/ml
            -   2.1.4.3. Make fresh for use
        -   2.1.5. Standard Molecule Control
        -   2.1.6. Detection Antibody: A647 labeled antibody
        -   2.1.7. Assay Wash buffer B
            -   2.1.7.1. BS* with 0.02% TRITON X-100 and 0.001% BSA
            -   2.1.7.2. 500 ml per assay plate
            -   2.1.7.3. Store at 4° C. for up to 1 month
        -   2.1.8. Elution Buffer
            -   2.1.8.1. 4 M urea, 1× BS with 0.02% TRITON X-100 and 0.001% BSA
            -   2.1.8.2. Approx. 8 ml per assay plate
        -   2.1.9. Microplate shaker (Jitterbug), set at “7”
        -   2.1.10. Microplate washer
        -   2.1.11. Centrifuge
    -   2.2. Procedure
    -   2.3. Record
        -   2.3.1. Plate assay plate number, kit lot number, and sample plates used
    -   2.4. Standard Curve
        -   2.4.1. Dilute control to 100 ng/ml in calibrator diluent
        -   2.4.2. Prepare ½ serial dilutions from 100 ng/ml to 0.01 pg/ml in calibrator diluent
    -   2.5. Sample Dilution
        -   2.5.1. Dilute samples 1:400 in assay buffer
    -   2.6. Capture and Detection
        -   2.6.1. Add 20 microliters/well of standards
        -   2.6.2. Add 20 microliters/well diluted unknowns
        -   2.6.3. Seal w/ acetate sealing tape. Shake for 2 minutes on plate shaker
        -   2.6.4. Incubate overnight at room temperature
        -   2.6.5. Dilute A647-labeled detection antibody to 50 ng/ml in assay buffer
        -   2.6.6. Aspirate. Wash 5× with 100 microliters wash buffer B
        -   2.6.7. Blot upside down
        -   2.6.8. Add 20 microliters/well diluted detection antibody
        -   2.6.9. Seal w/ acetate sealing tape. Shake for 2 minutes on plate shaker
        -   2.6.10. Incubate 2 hours at room temperature
        -   2.6.11. Aspirate. Wash 5× with 100 microliters wash buffer B
        -   2.6.12. Blot upside down
        -   2.6.13. Add 20 microliters/well elution buffer
        -   2.6.14. Seal w/ acetate sealing tape. Shake for 2 minutes on plate shaker
        -   2.6.15. Incubate ½ hour at 25° C.
        -   2.6.16. Centrifuge at 1000 rpm for 2 min, 25° C.
    -   2.7. Analyze on Single Molecule Detection instrument
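
The dilution arithmetic in the SOP above can be checked with a short Python sketch. It lists the points of a ½ serial dilution between the stated endpoints and computes the undiluted serum present in one well at the stated 1:400 dilution and 20 microliter well volume. The function names are hypothetical, and the series is assumed to stop at the first point at or below the lower endpoint.

```python
def serial_dilution(start, stop, factor=2.0):
    """List concentrations from `start` down to the first value <= `stop`."""
    if start <= 0 or stop <= 0 or factor <= 1:
        raise ValueError("start and stop must be positive and factor > 1")
    series = [start]
    while series[-1] > stop:
        series.append(series[-1] / factor)
    return series


def serum_per_well_ul(well_volume_ul, dilution_factor):
    """Undiluted serum contained in one well of diluted sample."""
    return well_volume_ul / dilution_factor


# 100 ng/ml is 100,000 pg/ml; halve until at or below 0.01 pg/ml.
points = serial_dilution(100_000.0, 0.01)
print(len(points) - 1, "halvings; lowest standard", points[-1], "pg/ml")

# 20 microliters per well of a 1:400 dilution.
print(serum_per_well_ul(20.0, 400.0), "microliters of serum per well")
```

Because the endpoints span seven orders of magnitude, halving takes 24 steps to reach the lower limit. At 1:400, each 20 microliter well holds 0.05 microliters of serum.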

Other Embodiments

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

What is claimed is:
 1. A single detection means of evaluating the health state of a human subject, comprising obtaining a measurement of at least one clinical biomarker from at least one live clinical sample isolated from said human subject, and inputting said measurement(s) into a model that calculates an output value correlated to said health state, the improvement comprising using as said model an algorithm that was developed by measurement of multiple development biomarkers comprising said clinical biomarker(s) from at least one legacy clinical sample set annotated for said health state, said measurement comprising use of legacy clinical samples having a sample volume of 1 ml or less, and analyzing said measurement of multiple biomarkers for an association with said health state.
 2. The means of claim 1, wherein said measurement of multiple development biomarkers comprises measuring at least two biomarkers from said sample volume.
 3. The means of claim 2, wherein said measurement of multiple development biomarkers comprises measuring at least 10 biomarkers from a sample volume less than about 0.5 milliliters.
 4. The means of claim 2, wherein said measurement of multiple development biomarkers comprises measuring at least 20 biomarkers from said sample volume.
 5. The means of claim 2, wherein said measurement of multiple development biomarkers comprises measuring at least 100 biomarkers from said sample volume.
 6. The means of claim 2, wherein said measurement of multiple development biomarkers comprises measuring at least 200 biomarkers from said sample volume.
 7. The means of claim 2, wherein said measurement of multiple development biomarkers comprises measuring at least 300 biomarkers from said sample volume.
 8. The means of claim 1, wherein said measurement of multiple development biomarkers comprises, for each biomarker, measuring said biomarker in an assay of said legacy clinical sample, wherein said assay used about 1 microliter (μL) or less of said sample volume for each biomarker.
 9. The means of claim 8, wherein at least 10 development biomarkers are measured per legacy clinical sample.
 10. The means of claim 8, wherein said measurement of multiple development biomarkers uses single molecule detection to measure said multiple development biomarkers in said legacy clinical sample.
 11. The means of claim 10, wherein said measurement of biomarkers in the legacy clinical sample by single molecule detection consists of dynamic quantitation.
 12. The means of claim 1, wherein said health state is the presence or absence of a disease.
 13. The means of claim 1, wherein said health state is the pre-disease or pre-disease condition.
 14. The means of claim 1, wherein said health state is the risk of developing a disease.
 15. The means of claim 12, wherein said absence of a disease is further defined to be a normal state or pre-disease state.
 16. The means of claim 1, wherein said biomarkers comprise traditional laboratory risk factors.
 17. The means of claim 1, wherein said live clinical sample isolated from said human subject is whole blood, serum, plasma, blood cells, endothelial cells, tissue biopsies, lymphatic fluid, ascites fluid, interstitial fluid, bone marrow, cerebrospinal fluid, saliva, sputum, sweat, or urine.
 18. The means of claim 17, wherein said live clinical sample is plasma or serum.
 19. The means of claim 17, wherein said sample is from a human subject undergoing one or more treatment regimens.
 20. The means of claim 19, wherein said treatment regimens are selected from a group consisting of therapeutics, prophylactics, exercise regimens, dietary supplementation, weight loss, surgical intervention, device implantation and exercise regimens.